On ranking relevant entities in heterogeneous networks using a language-based model

Authors

  • Laure Soulier
  • Lamjed Ben Jabeur
  • Lynda Tamine
  • Wahiba Bahsoun

Affiliation: IRIT, University of Toulouse, 118 route de Narbonne, 31062 Toulouse, France

Abstract

The availability of linked heterogeneous data raises a new challenge: accessing multiple relevant entities. In this paper, we address more specifically the problem of accessing relevant entities, such as publications and authors within a bibliographic network, given an information need. We propose a novel algorithm, called BibRank, that estimates the joint relevance of documents and authors within a bibliographic network. This model ranks each type of entity using a score propagation algorithm with respect to the query topic and the structure of the underlying bi-type information entity network. Two evidence sources, content-based and network-based scores, are both used to estimate the topical similarity between connected entities. For this purpose, authorship relationships are analysed through a language model-based score on the one hand, and on the other hand, non-topically related entities of the same type are detected through marginal citations. The article reports the results of experiments using the BibRank algorithm within an information retrieval task. The CiteSeerX bibliographic dataset forms the basis for the automatic generation of topical queries and for evaluation. We show that a statistically significant improvement over closely related ranking models is achieved.
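The abstract describes score propagation over a bi-type network of documents and authors. The following is a minimal sketch of that general idea only, not the BibRank algorithm itself, whose formulas are not given here. It assumes a generic damped mutual-reinforcement update; the names `normalize` and `propagate`, the `damping` value, and the use of plain edge weights in place of a language model-based authorship score are all illustrative assumptions.

```python
from collections import defaultdict


def normalize(scores):
    """Scale scores so they sum to 1 (uniform if all scores are zero)."""
    total = sum(scores.values())
    if total == 0:
        return {k: 1.0 / len(scores) for k in scores}
    return {k: v / total for k, v in scores.items()}


def propagate(doc_prior, author_prior, edges, damping=0.85, iterations=50):
    """Hypothetical damped propagation on a document-author graph.

    doc_prior / author_prior: query-dependent (content-based) scores.
    edges: (document, author, weight) triples with positive weights;
           every endpoint must appear in the corresponding prior.
    """
    doc_prior = normalize(doc_prior)
    author_prior = normalize(author_prior)

    # Total edge weight per node, used to split a node's score
    # proportionally among its neighbours.
    doc_out = defaultdict(float)
    author_out = defaultdict(float)
    for d, a, w in edges:
        doc_out[d] += w
        author_out[a] += w

    docs, authors = dict(doc_prior), dict(author_prior)
    for _ in range(iterations):
        # Each node keeps a share of its content-based prior ...
        new_docs = {d: (1 - damping) * p for d, p in doc_prior.items()}
        new_authors = {a: (1 - damping) * p for a, p in author_prior.items()}
        # ... and receives the rest from its neighbours of the other type.
        for d, a, w in edges:
            new_docs[d] += damping * authors[a] * w / author_out[a]
            new_authors[a] += damping * docs[d] * w / doc_out[d]
        docs, authors = new_docs, new_authors
    return docs, authors


# Toy example: a1 wrote the two documents that best match the query.
docs, authors = propagate(
    {"d1": 0.6, "d2": 0.3, "d3": 0.1},
    {"a1": 1.0, "a2": 1.0},
    [("d1", "a1", 1.0), ("d2", "a1", 1.0), ("d3", "a2", 1.0)],
)
print(sorted(docs, key=docs.get, reverse=True))
print(sorted(authors, key=authors.get, reverse=True))
```

In this sketch the priors play the role of the content-based evidence and the edges the role of the network-based evidence; a fuller model would also need same-type links such as citations, which are omitted here.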

Similar resources

A new approach based on data envelopment analysis with double frontiers for ranking the discovered rules from data mining

Data envelopment analysis (DEA) is a relatively new data-oriented approach to evaluating the performance of a set of peer entities called decision-making units (DMUs) that convert multiple inputs into multiple outputs. Within a relatively short period, DEA has become a strong quantitative and analytical tool for measuring and evaluating performance. In an article written by Toloo et al. (2009...

Comparing the Efficiency of Task-based Interactive Language Teaching and Task-based Language Teaching on Language Learners’ Fear of Negative Evaluation in University Heterogeneous Classes

Psychological barriers have always had negative effects on English learning. This research compared the efficiency of TBILT and TBLT on learners’ fear of negative evaluation. The statistical population included all 4200 Babol Azad University students, of whom 320 volunteered to participate in English language classes via public invitation. Then, 90 students were selected using ava...

Investigating the Impact of Authors’ Rank in Bibliographic Networks on Expertise Retrieval

Background and Aim: This research investigates the impact of authors’ rank in bibliographic networks on the document-centered model of expertise retrieval. Its purpose is to find out what kind of author ranking in bibliographic networks can improve the performance of the document-centered model. Methodology: The current research is experimental. To operationalize research goals, a new test colle...

A Novel Approach to Conditional Random Field-based Named Entity Recognition using Persian Specific Features

Named Entity Recognition is an information extraction technique that identifies named entities in a text. Three methods have conventionally been used to extract named entities from a text: rule-based, machine-learning-based, and hybrids of the two. Machine-learning-based methods perform well in the Persian language if they are trained with good features. To get good performanc...

Presenting an ensemble learning-based algorithm for learning to rank in information retrieval

Learning to rank refers to machine learning techniques for training a model in a ranking task. Learning to rank has been shown to be useful in many applications of information retrieval, natural language processing, and data mining. Learning to rank can be described by two systems: a learning system and a ranking system. The learning system takes training data as input and constructs a ranking ...

Journal title:
  • JASIST

Volume 64  Issue 

Pages  -

Publication date 2013